Return-Path: <icon-group-sender>
Received: (from root@localhost) by baskerville.CS.Arizona.EDU (8.9.1a/8.9.1) id RAA00220 for icon-group-addresses; Tue, 8 Jun 1999 17:05:52 -0700 (MST)
Message-Id: <199906090005.RAA00220@baskerville.CS.Arizona.EDU>
Delivered-To: icon-group@silliac.cs.arizona.edu
Date: Tue, 8 Jun 1999 15:41:22 -0500 (CDT)
From: "John C. Paolillo" <johnp@ling.uta.edu>
To: CHETHCOA@oss.oceaneering.com, icon-group@optima.CS.Arizona.EDU
Subject: Re: Fwd: Re: File Translators in Icon
Cc: john@ling.uta.edu
Errors-To: icon-group-errors@optima.CS.Arizona.EDU
Status: RO

I don't know if this will help, but I once had reason to write the
following Icon program to assist in typesetting Feature-Attribute
matrices in MS Word.  In MS Word for Mac v 4.0 and 5.0, there is a
family of mathematical typesetting formulas (listed under "Formula" in
the user manuals) which employ a special typesetting character (that
turns out to be ASCII 006) as an escape character.  The formulas are
acceptably typeset (or they were for my purposes) but very cumbersome
to type.  I wrote this program to translate from a more transparent
syntax into the form actually used by MS Word.

As for newer versions of MS Word, I don't know if you can type the
formula typesetting sequences directly (I couldn't figure out how to do
it), but my files with the complex formulas in them that were created
in MS Word v 4.0 still display correctly when converted.

I haven't tried this in a long time, so I am not 100% certain that this
is the best version.  There are probably improvements that can be made
to it at any rate.  Some of the typesetting that is possible (e.g.
integrals) is not implemented, but that could easily be added by
referring to the appropriate pages in the Word manuals.

The program was written to be used as a clipboard filter with the
Macintosh interpreter IClip.

John C. Paolillo
Program in Linguistics
University of Texas at Arlington

##########################################################################
#
#  This program parses a description of an attribute-value matrix using
#  ordinary ascii characters (using upper and lowercase letters for
#  feature values and labels, [] to delimit matrices, {} to delimit sets
#  and <> to delimit lists), and outputs a string of the appropriate
#  description, using MS Word's formula typesetting expressions.  The
#  resulting string, when pasted into a Word document, will appear in
#  the correctly typeset form.  For example, the following description
#
#    [ [ this | is ]
#      [ an | example
#        of | typesetting] ]
#
#  will be typeset as below (ascii impression).
#
#    +-                         -+
#    | [ this | is ]             |
#    | +-                     -+ |
#    | | an | example          | |
#    | | of | typesetting      | |
#    | +-                     -+ |
#    +-                         -+
#
#  The program will parse as many such expressions as are in the input,
#  until it encounters a non-well-formed description.  All remaining
#  input (well formed or not) is discarded.
#
##########################################################################

global white_sp                 # white space character set

procedure main()
   white_sp := ' \t'            # white space is tab and space
   line := ""
   while line ||:= read()       # concatenate input into one long string
                                # (removes return/new line characters)
   line ? {                     # and scan the line for parsing
      while a_parse := parse() do {
                                # write each parse as a separate line
         write(lst_2_str(a_parse))
      }
   }
end

#  This procedure converts an input list (a parse) into a formatted
#  string.  Many variations are possible.  This one replaces the
#  string, list and set delimiter characters with the appropriate
#  typesetting commands.  The escape character used below is ASCII 006,
#  Word's formula typesetting command character.
#  Using backslash to represent
#  this character, the strings replacing the delimiters are:
#
#    [    \b\bc\[(\a\al(     # begin a bracket delimited by "[]"
#                            # with a left-aligned array as argument
#
#    <    \b\bc\<(\a\ac(     # begin a bracket delimited by "<>"
#                            # with a center-aligned array as argument
#
#    {    \b\bc\{(\a\ac(     # begin a bracket delimited by "{}"
#                            # with a center-aligned array as argument
#
#    ]    ))                 # close two argument lists
#    >    ))
#    }    ))
#
#  In the strings below, ASCII 006 is written with the octal escape "\006".

procedure lst_2_str(alist)
   if x := string(alist) then   # don't convert it if it's already a string
      return case x of {        # do the above replacements
         "[" : "\006b\006bc\006[(\006a\006al("
         "<" : "\006b\006bc\006<(\006a\006ac("
         "{" : "\006b\006bc\006{(\006a\006ac("
         "]" : "))"
         ">" : "))"
         "}" : "))"
         default : x            # default to the incoming value
      }
   else {                       # if it's a list then convert it
      mid_str := ""             # mnemonically "middle string"
      every x := !alist do {    # iterate through the arguments
         mid_str ||:= lst_2_str(x)      # concatenate
         mid_str ||:= " "               # pad with space
      }
      return mid_str[1:-1]      # don't include the last space
   }
end

#  the parser looks for each of the following possibilities
#     label  -- a string of letters
#     expr   -- a label followed by a value
#     box    -- labels with values inside []
#     l_list -- boxes grouped inside <>
#     set    -- boxes grouped inside {}

procedure parse()
   suspend label() | expr() | box() | set() | l_list()
end

#  a label is a bunch of letters together with no whitespace or punctuation
#  no digits are allowed

procedure label()
   suspend tab(many(&letters))
end

#  an expression is a label with a value.  The value may be
#  a label (separated by "|"), a box, a set, a list or another expression

procedure expr()
   suspend [ one_of(label) ] |||
           ( [ ="|", one_of(label) ] |
             [ one_of( box | set | l_list | expr ) ] )
end

#  a box is some arbitrary number of expressions enclosed in []
#  the parsed expressions will have "," items between them in the parse
#  so that Word will know to typeset them vertically in the array.
#  DO NOT use the commas in your description, it will not parse if you do.
#  box()es must have at least one expression in them

procedure box()
   suspend [ ="[" ] ||| com_arbno(expr) ||| [ ="]" ]
end

#  a set is an arbitrary number of boxes enclosed in {} -- the same
#  caveats apply as for box()

procedure set()
   suspend [ ="{" ] ||| com_arbno(box) ||| [ ="}" ]
end

#  a l_list is an arbitrary number of boxes enclosed by <>, with no commas
#  between them so Word will know to typeset them horizontally.  There
#  may be zero items in a l_list.

procedure l_list()
   suspend [ ="<" ] ||| arbno(box) ||| [ =">" ]
end

#  com_arbno() produces a list containing an arbitrary number of
#  some matching expression p, separated by one fewer "," items
#  on the list than there are p's.  This is for things that need to
#  be typeset vertically by Word.  The base case is a single p, with
#  no commas in the output list.

procedure com_arbno(p)
   # recurse on com_arbno() (the posted version called arbno() here) so
   # that every pair of p's gets a "," between them, as described above
   suspend [one_of(p)] ||| ( [] | ( [ "," ] ||| com_arbno(p) ) )
end

#  arbno() produces a list containing an arbitrary number of some
#  matching procedure p.  This is for things that need to be typeset
#  horizontally.  The base case is an empty list.

procedure arbno(p)
   suspend [] | [one_of(p)] ||| arbno(p)
end

#  one_of() produces a single match of the expression p, discarding
#  any surrounding whitespace it encounters.  This is a useful technique
#  to use in a parser; with a little modification, it can also be
#  used to get optional constituents.

procedure one_of(p)
   tab(many(white_sp))
   x <- p()
   tab(many(white_sp))
   suspend \x
end
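
The comments in lst_2_str() use a backslash to stand for the ASCII 006
escape character, which is invisible on most terminals.  The following
is a minimal sketch, not part of the original program, of how the
output could be checked by eye in that same notation.  The procedure
name show() and the test string are assumptions made for illustration;
map() is the standard Icon character-mapping function.

```icon
# Hypothetical helper (not in the original program): replace each
# ASCII 006 character with a visible backslash for inspection.
procedure show(s)
   return map(s, "\006", "\\")
end

procedure main()
   local esc, opener
   esc := "\006"        # Word's formula escape character, per the message
   # build the replacement string that the comments give for "["
   opener := esc || "b" || esc || "bc" || esc || "[(" ||
             esc || "a" || esc || "al("
   write(show(opener))  # writes \b\bc\[(\a\al(
end
```

In the translator itself, the same idea would be to call
write(show(lst_2_str(a_parse))) while debugging.  The unmodified output,
with the real ASCII 006 characters, is what would be pasted into Word.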